Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 361037 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 35.8 MiB |
| Average record size in memory | 104.0 B |
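The total size can be cross-checked from the two figures above. This is a minimal sketch; the variable names are illustrative, and the 8-bytes-per-column reading is an assumption (it happens to match 13 columns × 8 B = 104 B), not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_observations = 361_037
avg_record_bytes = 104.0

# Total size = rows x average record size, converted to MiB (1 MiB = 2**20 bytes).
total_bytes = n_observations * avg_record_bytes
total_mib = total_bytes / 2**20

# Assumption: each of the 13 columns costs 8 bytes per row
# (int64, float64, or an object pointer), which gives 104 bytes per record.
assumed_bytes_per_record = 13 * 8

print(f"{total_mib:.1f} MiB, {assumed_bytes_per_record} B per record")
```

Under the same assumption, a single column takes 361037 × 8 bytes, about 2.8 MiB, which agrees with the per-variable "Memory size" rows.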
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 5 |
Alerts
| Alert | Type |
|---|---|
| imp_hash has constant value "25c7ac00c91884fd2923a489ae9dfbca" | Constant |
| filename has a high cardinality: 733 distinct values | High cardinality |
| sha256 has a high cardinality: 10251 distinct values | High cardinality |
| sec_md5 has a high cardinality: 9182 distinct values | High cardinality |
| sec_name has a high cardinality: 15918 distinct values | High cardinality |
| df_index is highly correlated with Unnamed: 0 and 1 other fields | High correlation |
| Unnamed: 0 is highly correlated with df_index and 1 other fields | High correlation |
| win_count is highly correlated with df_index and 1 other fields | High correlation |
| sec_entropy is highly correlated with virtual_address | High correlation |
| raw_size is highly correlated with virtual_size | High correlation |
| virtual_size is highly correlated with raw_size and 1 other fields | High correlation |
| virtual_address is highly correlated with sec_entropy and 1 other fields | High correlation |
| df_index is highly correlated with Unnamed: 0 and 1 other fields | High correlation |
| Unnamed: 0 is highly correlated with df_index and 1 other fields | High correlation |
| win_count is highly correlated with df_index and 1 other fields | High correlation |
| sec_chi2 is highly correlated with raw_size and 1 other fields | High correlation |
| sec_entropy is highly correlated with raw_size and 2 other fields | High correlation |
| raw_size is highly correlated with sec_chi2 and 2 other fields | High correlation |
| virtual_size is highly correlated with sec_chi2 and 2 other fields | High correlation |
| virtual_address is highly correlated with sec_entropy | High correlation |
| df_index is highly correlated with Unnamed: 0 and 1 other fields | High correlation |
| Unnamed: 0 is highly correlated with df_index and 1 other fields | High correlation |
| win_count is highly correlated with df_index and 1 other fields | High correlation |
| raw_size is highly correlated with virtual_size | High correlation |
| virtual_size is highly correlated with raw_size | High correlation |
| df_index is highly correlated with Unnamed: 0 and 1 other fields | High correlation |
| Unnamed: 0 is highly correlated with df_index and 1 other fields | High correlation |
| win_count is highly correlated with df_index and 1 other fields | High correlation |
| sec_chi2 is highly correlated with raw_size and 2 other fields | High correlation |
| sec_entropy is highly correlated with raw_size and 2 other fields | High correlation |
| raw_size is highly correlated with sec_chi2 and 3 other fields | High correlation |
| virtual_size is highly correlated with sec_chi2 and 3 other fields | High correlation |
| virtual_address is highly correlated with sec_chi2 and 3 other fields | High correlation |
| df_index has unique values | Unique |
| Unnamed: 0 has unique values | Unique |
| sec_entropy has 259677 (71.9%) zeros | Zeros |
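The constant, unique, and high-cardinality alerts can be understood in terms of distinct counts alone. The sketch below is illustrative, not the profiler's actual logic: the `classify` helper and the cutoff of 50 are hypothetical, and only the distinct counts come from this report.

```python
from typing import Optional

N_ROWS = 361_037
# Hypothetical cutoff for illustration; the report does not state its threshold.
HIGH_CARDINALITY_THRESHOLD = 50


def classify(n_distinct: int, n_rows: int, categorical: bool) -> Optional[str]:
    """Return a distinct-count-based alert label, or None if no rule fires."""
    if n_distinct == 1:
        return "Constant"
    if n_distinct == n_rows:
        return "Unique"
    if categorical and n_distinct > HIGH_CARDINALITY_THRESHOLD:
        return "High cardinality"
    return None


# (distinct count, is categorical) as listed in this report.
columns = {
    "imp_hash": (1, True),
    "filename": (733, True),
    "sha256": (10_251, True),
    "sec_md5": (9_182, True),
    "sec_name": (15_918, True),
    "df_index": (361_037, False),
    "Unnamed: 0": (361_037, False),
}

for name, (n_distinct, categorical) in columns.items():
    print(name, "->", classify(n_distinct, N_ROWS, categorical))
```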
Reproduction
| Analysis started | 2022-08-22 03:18:26.317599 |
|---|---|
| Analysis finished | 2022-08-22 03:18:41.522832 |
| Duration | 15.21 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
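The reported duration follows from the two timestamps above. A small check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2022-08-22 03:18:26.317599")
finished = datetime.fromisoformat("2022-08-22 03:18:41.522832")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```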
df_index
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 361037 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4440196.256 |
| Minimum | 890523 |
|---|---|
| Maximum | 5674752 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 890523 |
|---|---|
| 5-th percentile | 3535314.8 |
| Q1 | 4263319 |
| median | 4514605 |
| Q3 | 4776000 |
| 95-th percentile | 5097902.2 |
| Maximum | 5674752 |
| Range | 4784229 |
| Interquartile range (IQR) | 512681 |
Descriptive statistics
| Standard deviation | 622718.1196 |
|---|---|
| Coefficient of variation (CV) | 0.1402456296 |
| Kurtosis | 11.46646499 |
| Mean | 4440196.256 |
| Median Absolute Deviation (MAD) | 254991 |
| Skewness | -2.921966442 |
| Sum | 1.603075136 × 10^12 |
| Variance | 3.877778565 × 10^11 |
| Monotonicity | Strictly increasing |
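Several of the df_index figures above are derived from others, so they can be checked against each other. The sketch below uses only values copied from the tables; the variable names are illustrative.

```python
# Values copied from the df_index tables.
minimum, maximum = 890_523, 5_674_752
q1, q3 = 4_263_319, 4_776_000
mean, std = 4_440_196.256, 622_718.1196

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared

print(value_range, iqr, round(cv, 10), f"{variance:.9e}")
```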
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 890523 | 1 | < 0.1% |
| 4683030 | 1 | < 0.1% |
| 4682908 | 1 | < 0.1% |
| 4682907 | 1 | < 0.1% |
| 4682906 | 1 | < 0.1% |
| 4682887 | 1 | < 0.1% |
| 4682886 | 1 | < 0.1% |
| 4682885 | 1 | < 0.1% |
| 4682884 | 1 | < 0.1% |
| 4682883 | 1 | < 0.1% |
| Other values (361027) | 361027 |
| Value | Count | Frequency (%) |
| 890523 | 1 | |
| 890524 | 1 | |
| 890525 | 1 | |
| 890526 | 1 | |
| 890527 | 1 | |
| 890528 | 1 | |
| 890529 | 1 | |
| 890530 | 1 | |
| 890531 | 1 | |
| 890532 | 1 |
| Value | Count | Frequency (%) |
| 5674752 | 1 | |
| 5674751 | 1 | |
| 5674750 | 1 | |
| 5674749 | 1 | |
| 5674748 | 1 | |
| 5674747 | 1 | |
| 5674746 | 1 | |
| 5674745 | 1 | |
| 5674744 | 1 | |
| 5674743 | 1 |
Unnamed: 0
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 361037 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4440196.256 |
| Minimum | 890523 |
|---|---|
| Maximum | 5674752 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 890523 |
|---|---|
| 5-th percentile | 3535314.8 |
| Q1 | 4263319 |
| median | 4514605 |
| Q3 | 4776000 |
| 95-th percentile | 5097902.2 |
| Maximum | 5674752 |
| Range | 4784229 |
| Interquartile range (IQR) | 512681 |
Descriptive statistics
| Standard deviation | 622718.1196 |
|---|---|
| Coefficient of variation (CV) | 0.1402456296 |
| Kurtosis | 11.46646499 |
| Mean | 4440196.256 |
| Median Absolute Deviation (MAD) | 254991 |
| Skewness | -2.921966442 |
| Sum | 1.603075136 × 10^12 |
| Variance | 3.877778565 × 10^11 |
| Monotonicity | Strictly increasing |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 890523 | 1 | < 0.1% |
| 4683030 | 1 | < 0.1% |
| 4682908 | 1 | < 0.1% |
| 4682907 | 1 | < 0.1% |
| 4682906 | 1 | < 0.1% |
| 4682887 | 1 | < 0.1% |
| 4682886 | 1 | < 0.1% |
| 4682885 | 1 | < 0.1% |
| 4682884 | 1 | < 0.1% |
| 4682883 | 1 | < 0.1% |
| Other values (361027) | 361027 |
| Value | Count | Frequency (%) |
| 890523 | 1 | |
| 890524 | 1 | |
| 890525 | 1 | |
| 890526 | 1 | |
| 890527 | 1 | |
| 890528 | 1 | |
| 890529 | 1 | |
| 890530 | 1 | |
| 890531 | 1 | |
| 890532 | 1 |
| Value | Count | Frequency (%) |
| 5674752 | 1 | |
| 5674751 | 1 | |
| 5674750 | 1 | |
| 5674749 | 1 | |
| 5674748 | 1 | |
| 5674747 | 1 | |
| 5674746 | 1 | |
| 5674745 | 1 | |
| 5674744 | 1 | |
| 5674743 | 1 |
filename
Categorical
HIGH CARDINALITY
| Distinct | 733 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| 2022041921/2022041921_40 | 2552 |
|---|---|
| 2022041921/2022041921_5 | 2237 |
| 2022041921/2022041921_7 | 2191 |
| 2022041921/2022041921_39 | 2122 |
| 2022041920/2022041920_45 | 2095 |
| Other values (728) |
Length
| Max length | 24 |
|---|---|
| Median length | 24 |
| Mean length | 23.83824096 |
| Min length | 23 |
Characters and Unicode
| Total characters | 8606487 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022041900/2022041900_10 |
|---|---|
| 2nd row | 2022041900/2022041900_10 |
| 3rd row | 2022041900/2022041900_10 |
| 4th row | 2022041900/2022041900_10 |
| 5th row | 2022041900/2022041900_10 |
Common Values
| Value | Count | Frequency (%) |
| 2022041921/2022041921_40 | 2552 | 0.7% |
| 2022041921/2022041921_5 | 2237 | 0.6% |
| 2022041921/2022041921_7 | 2191 | 0.6% |
| 2022041921/2022041921_39 | 2122 | 0.6% |
| 2022041920/2022041920_45 | 2095 | 0.6% |
| 2022041919/2022041919_40 | 2089 | 0.6% |
| 2022041919/2022041919_1 | 2036 | 0.6% |
| 2022041920/2022041920_55 | 1983 | 0.5% |
| 2022041920/2022041920_49 | 1970 | 0.5% |
| 2022041919/2022041919_42 | 1960 | 0.5% |
| Other values (723) | 339802 |
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2740391 | |
| 0 | 1686041 | |
| 1 | 1250600 | |
| 9 | 921679 | 10.7% |
| 4 | 838591 | 9.7% |
| / | 361037 | 4.2% |
| _ | 361037 | 4.2% |
| 3 | 137618 | 1.6% |
| 8 | 131154 | 1.5% |
| 5 | 99506 | 1.2% |
| Other values (2) | 78833 | 0.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 7884413 | |
| Other Punctuation | 361037 | 4.2% |
| Connector Punctuation | 361037 | 4.2% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2740391 | |
| 0 | 1686041 | |
| 1 | 1250600 | |
| 9 | 921679 | 11.7% |
| 4 | 838591 | 10.6% |
| 3 | 137618 | 1.7% |
| 8 | 131154 | 1.7% |
| 5 | 99506 | 1.3% |
| 7 | 41268 | 0.5% |
| 6 | 37565 | 0.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 361037 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 361037 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8606487 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2740391 | |
| 0 | 1686041 | |
| 1 | 1250600 | |
| 9 | 921679 | 10.7% |
| 4 | 838591 | 9.7% |
| / | 361037 | 4.2% |
| _ | 361037 | 4.2% |
| 3 | 137618 | 1.6% |
| 8 | 131154 | 1.5% |
| 5 | 99506 | 1.2% |
| Other values (2) | 78833 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8606487 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2740391 | |
| 0 | 1686041 | |
| 1 | 1250600 | |
| 9 | 921679 | 10.7% |
| 4 | 838591 | 9.7% |
| / | 361037 | 4.2% |
| _ | 361037 | 4.2% |
| 3 | 137618 | 1.6% |
| 8 | 131154 | 1.5% |
| 5 | 99506 | 1.2% |
| Other values (2) | 78833 | 0.9% |
win_count
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 10284 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 415397.3033 |
| Minimum | 1371 |
|---|---|
| Maximum | 615253 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 1371 |
|---|---|
| 5-th percentile | 348689 |
| Q1 | 383732 |
| median | 419650 |
| Q3 | 454941 |
| 95-th percentile | 506850 |
| Maximum | 615253 |
| Range | 613882 |
| Interquartile range (IQR) | 71209 |
Descriptive statistics
| Standard deviation | 68022.63825 |
|---|---|
| Coefficient of variation (CV) | 0.1637532013 |
| Kurtosis | 8.514812051 |
| Mean | 415397.3033 |
| Median Absolute Deviation (MAD) | 35637 |
| Skewness | -1.927899435 |
| Sum | 1.499737962 × 10^11 |
| Variance | 4627079314 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 423722 | 97 | < 0.1% |
| 395177 | 96 | < 0.1% |
| 401436 | 84 | < 0.1% |
| 398996 | 83 | < 0.1% |
| 381232 | 80 | < 0.1% |
| 363563 | 77 | < 0.1% |
| 401564 | 72 | < 0.1% |
| 387013 | 67 | < 0.1% |
| 369543 | 62 | < 0.1% |
| 363207 | 60 | < 0.1% |
| Other values (10274) | 360259 |
| Value | Count | Frequency (%) |
| 1371 | 50 | |
| 3450 | 42 | |
| 3733 | 50 | |
| 4954 | 50 | |
| 5022 | 30 | |
| 5080 | 49 | |
| 5408 | 44 | |
| 5419 | 31 | |
| 5731 | 34 | |
| 6208 | 49 |
| Value | Count | Frequency (%) |
| 615253 | 48 | |
| 615063 | 32 | |
| 614731 | 32 | |
| 612544 | 50 | |
| 612494 | 19 | < 0.1% |
| 612277 | 44 | |
| 610236 | 26 | |
| 609956 | 22 | |
| 609721 | 26 | |
| 609217 | 23 |
sha256
Categorical
HIGH CARDINALITY
| Distinct | 10251 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| 04d66260ccaca00fdb56608492b754a4cdd6d7be24a48c6463c0c9bf13ecfd81 | 100 |
|---|---|
| bf05399e1c937d3dfb38be3600967b703d706cbe34dd130e7e9b8c90044a4754 | 100 |
| 96ac2c6669518e0a3ad43796b1511ee5726fdd86bcc14714d9576be327b97ea1 | 100 |
| db1d37343dc606670df06f44162a63ca43d00552a790969b2393db3b3da048f0 | 100 |
| 9bd5b3351e19f00d6dee37fdd922e886a1c64712f59b20e5d729f87ce15c39a1 | 98 |
| Other values (10246) |
Length
| Max length | 64 |
|---|---|
| Median length | 64 |
| Mean length | 64 |
| Min length | 64 |
Characters and Unicode
| Total characters | 23106368 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 |
|---|---|
| 2nd row | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 |
| 3rd row | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 |
| 4th row | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 |
| 5th row | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 |
Common Values
| Value | Count | Frequency (%) |
| 04d66260ccaca00fdb56608492b754a4cdd6d7be24a48c6463c0c9bf13ecfd81 | 100 | < 0.1% |
| bf05399e1c937d3dfb38be3600967b703d706cbe34dd130e7e9b8c90044a4754 | 100 | < 0.1% |
| 96ac2c6669518e0a3ad43796b1511ee5726fdd86bcc14714d9576be327b97ea1 | 100 | < 0.1% |
| db1d37343dc606670df06f44162a63ca43d00552a790969b2393db3b3da048f0 | 100 | < 0.1% |
| 9bd5b3351e19f00d6dee37fdd922e886a1c64712f59b20e5d729f87ce15c39a1 | 98 | < 0.1% |
| 138cf191475aa7d3c71ffd4f817b101e615618569389d094eebdd76f77f408a8 | 96 | < 0.1% |
| e26b5b9a05f172b9d6c14409fb7f63c1344c21df0fbc56191b18061c10cacbf1 | 94 | < 0.1% |
| bfba2a465cd51dd5d1bbcbaed0079c1cbbe66ef20cc60bf3aa624bdfab5f0ec8 | 92 | < 0.1% |
| e8553934404d9f5bff496180b6a10e7931962f04743fd3674534f568c13cc843 | 90 | < 0.1% |
| c017751a5a0af4a61f214148b092aa1a96716a84361b414cace0edbc2c7d6c7c | 90 | < 0.1% |
| Other values (10241) | 360077 |
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| d | 1457692 | 6.3% |
| 8 | 1451048 | 6.3% |
| 7 | 1450456 | 6.3% |
| 1 | 1449136 | 6.3% |
| 0 | 1449006 | 6.3% |
| 3 | 1445882 | 6.3% |
| a | 1445840 | 6.3% |
| c | 1444017 | 6.2% |
| 9 | 1443498 | 6.2% |
| 6 | 1441941 | 6.2% |
| Other values (6) | 8627852 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 14449882 | |
| Lowercase Letter | 8656486 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 1451048 | |
| 7 | 1450456 | |
| 1 | 1449136 | |
| 0 | 1449006 | |
| 3 | 1445882 | |
| 9 | 1443498 | |
| 6 | 1441941 | |
| 5 | 1441911 | |
| 2 | 1439061 | |
| 4 | 1437943 |
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 1457692 | |
| a | 1445840 | |
| c | 1444017 | |
| f | 1441070 | |
| e | 1436031 | |
| b | 1431836 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 14449882 | |
| Latin | 8656486 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 8 | 1451048 | |
| 7 | 1450456 | |
| 1 | 1449136 | |
| 0 | 1449006 | |
| 3 | 1445882 | |
| 9 | 1443498 | |
| 6 | 1441941 | |
| 5 | 1441911 | |
| 2 | 1439061 | |
| 4 | 1437943 |
Latin
| Value | Count | Frequency (%) |
| d | 1457692 | |
| a | 1445840 | |
| c | 1444017 | |
| f | 1441070 | |
| e | 1436031 | |
| b | 1431836 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 23106368 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| d | 1457692 | 6.3% |
| 8 | 1451048 | 6.3% |
| 7 | 1450456 | 6.3% |
| 1 | 1449136 | 6.3% |
| 0 | 1449006 | 6.3% |
| 3 | 1445882 | 6.3% |
| a | 1445840 | 6.3% |
| c | 1444017 | 6.2% |
| 9 | 1443498 | 6.2% |
| 6 | 1441941 | 6.2% |
| Other values (6) | 8627852 |
imp_hash
Categorical
CONSTANT
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| 25c7ac00c91884fd2923a489ae9dfbca |
|---|
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 11553184 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 25c7ac00c91884fd2923a489ae9dfbca |
|---|---|
| 2nd row | 25c7ac00c91884fd2923a489ae9dfbca |
| 3rd row | 25c7ac00c91884fd2923a489ae9dfbca |
| 4th row | 25c7ac00c91884fd2923a489ae9dfbca |
| 5th row | 25c7ac00c91884fd2923a489ae9dfbca |
Common Values
| Value | Count | Frequency (%) |
| 25c7ac00c91884fd2923a489ae9dfbca | 361037 |
Length
Histogram of lengths of the category
Category Frequency Plot
Most occurring characters
| Value | Count | Frequency (%) |
| c | 1444148 | |
| a | 1444148 | |
| 9 | 1444148 | |
| 2 | 1083111 | |
| 8 | 1083111 | |
| 0 | 722074 | 6.2% |
| 4 | 722074 | 6.2% |
| f | 722074 | 6.2% |
| d | 722074 | 6.2% |
| 5 | 361037 | 3.1% |
| Other values (5) | 1805185 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6498666 | |
| Lowercase Letter | 5054518 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 1444148 | |
| 2 | 1083111 | |
| 8 | 1083111 | |
| 0 | 722074 | |
| 4 | 722074 | |
| 5 | 361037 | 5.6% |
| 7 | 361037 | 5.6% |
| 1 | 361037 | 5.6% |
| 3 | 361037 | 5.6% |
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 1444148 | |
| a | 1444148 | |
| f | 722074 | |
| d | 722074 | |
| e | 361037 | 7.1% |
| b | 361037 | 7.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6498666 | |
| Latin | 5054518 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 9 | 1444148 | |
| 2 | 1083111 | |
| 8 | 1083111 | |
| 0 | 722074 | |
| 4 | 722074 | |
| 5 | 361037 | 5.6% |
| 7 | 361037 | 5.6% |
| 1 | 361037 | 5.6% |
| 3 | 361037 | 5.6% |
Latin
| Value | Count | Frequency (%) |
| c | 1444148 | |
| a | 1444148 | |
| f | 722074 | |
| d | 722074 | |
| e | 361037 | 7.1% |
| b | 361037 | 7.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11553184 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| c | 1444148 | |
| a | 1444148 | |
| 9 | 1444148 | |
| 2 | 1083111 | |
| 8 | 1083111 | |
| 0 | 722074 | 6.2% |
| 4 | 722074 | 6.2% |
| f | 722074 | 6.2% |
| d | 722074 | 6.2% |
| 5 | 361037 | 3.1% |
| Other values (5) | 1805185 |
sec_chi2
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 8562 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3896430.123 |
| Minimum | 37912.29 |
|---|---|
| Maximum | 73113600 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 37912.29 |
|---|---|
| 5-th percentile | 65485.4 |
| Q1 | 1044480 |
| median | 1044480 |
| Q3 | 2088960 |
| 95-th percentile | 7311360 |
| Maximum | 73113600 |
| Range | 73075687.71 |
| Interquartile range (IQR) | 1044480 |
Descriptive statistics
| Standard deviation | 13135118.62 |
|---|---|
| Coefficient of variation (CV) | 3.371064848 |
| Kurtosis | 23.47635445 |
| Mean | 3896430.123 |
| Median Absolute Deviation (MAD) | 362902 |
| Skewness | 5.025302897 |
| Sum | 1.406755442 × 10^12 |
| Variance | 1.725313412 × 10^14 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1044480 | 157453 | |
| 2088960 | 83165 | |
| 73113600 | 12326 | 3.4% |
| 498667.88 | 10295 | 2.9% |
| 273516.75 | 10295 | 2.9% |
| 940514.5 | 10295 | 2.9% |
| 2810188 | 10295 | 2.9% |
| 134540.95 | 10295 | 2.9% |
| 65485.4 | 10290 | 2.9% |
| 4015190.25 | 10290 | 2.9% |
| Other values (8552) | 36038 | 10.0% |
| Value | Count | Frequency (%) |
| 37912.29 | 1 | < 0.1% |
| 37913.59 | 1 | < 0.1% |
| 37919.14 | 13 | < 0.1% |
| 37920 | 1 | < 0.1% |
| 37922.22 | 1 | < 0.1% |
| 37922.54 | 1 | < 0.1% |
| 37923.03 | 10277 | |
| 61303 | 1 | < 0.1% |
| 61381.88 | 1 | < 0.1% |
| 61682.75 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 73113600 | 12326 | |
| 54312960 | 245 | 0.1% |
| 7311360 | 6488 | |
| 4015315.75 | 1 | < 0.1% |
| 4015190.25 | 10290 | |
| 4015190 | 1 | < 0.1% |
| 4015189.75 | 1 | < 0.1% |
| 4015189.25 | 1 | < 0.1% |
| 4015182 | 1 | < 0.1% |
| 2810188 | 10295 |
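The most frequent values above (1044480, 2088960, 73113600) are exactly 255 times 4096, 8192, and 286720. That is what a Pearson chi-square statistic against a uniform byte distribution yields for a block made of a single repeated byte. The report does not say how this column was computed, so the sketch below is an assumption that happens to reproduce those values; `byte_chi2` is a hypothetical helper name.

```python
from collections import Counter


def byte_chi2(data: bytes) -> float:
    """Pearson chi-square of the byte histogram against a uniform distribution.

    Assumption: this mirrors how the chi-square column was derived;
    the report itself does not document the formula.
    """
    expected = len(data) / 256  # expected count per byte value under uniformity
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


# A block of n identical bytes gives exactly 255 * n.
for size in (4096, 8192, 286720):
    print(size, byte_chi2(bytes(size)))
```

Under this reading, the dominant values describe sections filled with one byte value (for example zero padding), while well-mixed data would score close to 255.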
sec_entropy
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 406 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.210155109 |
| Minimum | 0 |
|---|---|
| Maximum | 7.87 |
| Zeros | 259677 |
| Zeros (%) | 71.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.54 |
| 95-th percentile | 7.82 |
| Maximum | 7.87 |
| Range | 7.87 |
| Interquartile range (IQR) | 0.54 |
Descriptive statistics
| Standard deviation | 2.365520452 |
|---|---|
| Coefficient of variation (CV) | 1.954725005 |
| Kurtosis | 1.816032852 |
| Mean | 1.210155109 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.801276212 |
| Sum | 436910.77 |
| Variance | 5.595687007 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 259677 | |
| 0.54 | 10318 | 2.9% |
| 4.09 | 10299 | 2.9% |
| 2.78 | 10296 | 2.9% |
| 7.82 | 10295 | 2.9% |
| 4.04 | 10295 | 2.9% |
| 6.67 | 10295 | 2.9% |
| 7.87 | 10295 | 2.9% |
| 5.14 | 10295 | 2.9% |
| 0.48 | 5616 | 1.6% |
| Other values (396) | 13356 | 3.7% |
| Value | Count | Frequency (%) |
| 0 | 259677 | |
| 0.23 | 4 | < 0.1% |
| 0.24 | 2 | < 0.1% |
| 0.31 | 1 | < 0.1% |
| 0.32 | 2 | < 0.1% |
| 0.33 | 22 | < 0.1% |
| 0.34 | 30 | < 0.1% |
| 0.36 | 2 | < 0.1% |
| 0.46 | 1 | < 0.1% |
| 0.48 | 5616 | 1.6% |
| Value | Count | Frequency (%) |
| 7.87 | 10295 | |
| 7.82 | 10295 | |
| 6.67 | 10295 | |
| 5.75 | 520 | 0.1% |
| 5.74 | 609 | 0.2% |
| 5.73 | 34 | < 0.1% |
| 5.72 | 66 | < 0.1% |
| 5.71 | 1 | < 0.1% |
| 5.57 | 10 | < 0.1% |
| 5.56 | 5 | < 0.1% |
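The range of these values (0 to 7.87, below the 8-bit ceiling) is consistent with Shannon entropy measured in bits per byte, although the report does not state the formula. A minimal sketch under that assumption, with `shannon_entropy` as a hypothetical helper name:

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0).

    Assumption: the entropy column uses this definition; it is not
    documented in the report.
    """
    if not data:
        return 0.0
    n = len(data)
    # Sum p * log2(1/p) over the byte values that actually occur.
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())


print(shannon_entropy(bytes(4096)))        # one repeated byte value
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A block of identical bytes has entropy 0, which would explain why 71.9% of rows are zero if most sections consist of a single repeated byte.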
sec_md5
Categorical
HIGH CARDINALITY
| Distinct | 9182 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| 620f0b67a91f7f74151bc5be745b7110 | |
|---|---|
| 0829f71740aab1ab98b33eae21dee122 | |
| 4579108cda3cebc6432027a86e7b7a9b | 12326 |
| b60eaab7c709450be3cba1c56615936c | 10295 |
| a24d116ab001c6148d20035da2529014 | 10295 |
| Other values (9177) |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 11553184 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 9083 |
|---|---|
| Unique (%) | 2.5% |
Sample
| 1st row | 072b707ef80f7c15338cf1fd7c7212aa |
|---|---|
| 2nd row | 9f46fc5a3fcb244f69598b159edfecd8 |
| 3rd row | a24d116ab001c6148d20035da2529014 |
| 4th row | ca2ef02cb2d2c48858ceb3137e44019d |
| 5th row | 218e10d0dff11714c4062e870cc733ae |
Common Values
| Value | Count | Frequency (%) |
| 620f0b67a91f7f74151bc5be745b7110 | 157453 | |
| 0829f71740aab1ab98b33eae21dee122 | 83165 | |
| 4579108cda3cebc6432027a86e7b7a9b | 12326 | 3.4% |
| b60eaab7c709450be3cba1c56615936c | 10295 | 2.9% |
| a24d116ab001c6148d20035da2529014 | 10295 | 2.9% |
| ca2ef02cb2d2c48858ceb3137e44019d | 10295 | 2.9% |
| bac11b3e0f44cd6f7fdbe64b8961f1d4 | 10295 | 2.9% |
| 0e0a578022c8fef3fb155e5713e8195c | 10295 | 2.9% |
| 072b707ef80f7c15338cf1fd7c7212aa | 10290 | 2.9% |
| 218e10d0dff11714c4062e870cc733ae | 10290 | 2.9% |
| Other values (9172) | 36038 | 10.0% |
Length
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 1458501 | |
| 7 | 1242407 | |
| b | 1101982 | |
| 0 | 926630 | 8.0% |
| f | 802065 | 6.9% |
| e | 726248 | 6.3% |
| 2 | 724404 | 6.3% |
| a | 709087 | 6.1% |
| 5 | 690820 | 6.0% |
| 4 | 617735 | 5.3% |
| Other values (6) | 2553305 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 7448121 | |
| Lowercase Letter | 4105063 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1458501 | |
| 7 | 1242407 | |
| 0 | 926630 | |
| 2 | 724404 | |
| 5 | 690820 | |
| 4 | 617735 | |
| 9 | 517054 | 6.9% |
| 6 | 515823 | 6.9% |
| 8 | 391529 | 5.3% |
| 3 | 363218 | 4.9% |
Lowercase Letter
| Value | Count | Frequency (%) |
| b | 1101982 | |
| f | 802065 | |
| e | 726248 | |
| a | 709087 | |
| c | 477312 | |
| d | 288369 | 7.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7448121 | |
| Latin | 4105063 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 1458501 | |
| 7 | 1242407 | |
| 0 | 926630 | |
| 2 | 724404 | |
| 5 | 690820 | |
| 4 | 617735 | |
| 9 | 517054 | 6.9% |
| 6 | 515823 | 6.9% |
| 8 | 391529 | 5.3% |
| 3 | 363218 | 4.9% |
Latin
| Value | Count | Frequency (%) |
| b | 1101982 | |
| f | 802065 | |
| e | 726248 | |
| a | 709087 | |
| c | 477312 | |
| d | 288369 | 7.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11553184 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 1458501 | |
| 7 | 1242407 | |
| b | 1101982 | |
| 0 | 926630 | 8.0% |
| f | 802065 | 6.9% |
| e | 726248 | 6.3% |
| 2 | 724404 | 6.3% |
| a | 709087 | 6.1% |
| 5 | 690820 | 6.0% |
| 4 | 617735 | 5.3% |
| Other values (6) | 2553305 |
raw_size
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35764.22893 |
| Minimum | 4096 |
|---|---|
| Maximum | 286720 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 4096 |
|---|---|
| 5-th percentile | 4096 |
| Q1 | 4096 |
| median | 4096 |
| Q3 | 8192 |
| 95-th percentile | 270336 |
| Maximum | 286720 |
| Range | 282624 |
| Interquartile range (IQR) | 4096 |
Descriptive statistics
| Standard deviation | 76828.9883 |
|---|---|
| Coefficient of variation (CV) | 2.148207597 |
| Kurtosis | 4.484647564 |
| Mean | 35764.22893 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.447244966 |
| Sum | 1.291220992 × 10^10 |
| Variance | 5902693443 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) |
| 4096 | 204310 | |
| 8192 | 84556 | |
| 286720 | 13462 | 3.7% |
| 163840 | 10295 | 2.9% |
| 270336 | 10295 | 2.9% |
| 192512 | 10295 | 2.9% |
| 65536 | 10295 | 2.9% |
| 12288 | 10295 | 2.9% |
| 28672 | 6895 | 1.9% |
| 212992 | 339 | 0.1% |
| Value | Count | Frequency (%) |
| 4096 | 204310 | |
| 8192 | 84556 | |
| 12288 | 10295 | 2.9% |
| 28672 | 6895 | 1.9% |
| 65536 | 10295 | 2.9% |
| 163840 | 10295 | 2.9% |
| 192512 | 10295 | 2.9% |
| 212992 | 339 | 0.1% |
| 270336 | 10295 | 2.9% |
| 286720 | 13462 | 3.7% |
| Value | Count | Frequency (%) |
| 286720 | 13462 | 3.7% |
| 270336 | 10295 | 2.9% |
| 212992 | 339 | 0.1% |
| 192512 | 10295 | 2.9% |
| 163840 | 10295 | 2.9% |
| 65536 | 10295 | 2.9% |
| 28672 | 6895 | 1.9% |
| 12288 | 10295 | 2.9% |
| 8192 | 84556 | |
| 4096 | 204310 |
virtual_size
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 364 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 33554.18039 |
| Minimum | 123 |
|---|---|
| Maximum | 283074 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 123 |
|---|---|
| 5-th percentile | 431 |
| Q1 | 1787 |
| median | 3062 |
| Q3 | 7782 |
| 95-th percentile | 269833 |
| Maximum | 283074 |
| Range | 282951 |
| Interquartile range (IQR) | 5995 |
Descriptive statistics
| Standard deviation | 76880.4531 |
|---|---|
| Coefficient of variation (CV) | 2.291233229 |
| Kurtosis | 4.444827051 |
| Mean | 33554.18039 |
| Median Absolute Deviation (MAD) | 2040 |
| Skewness | 2.439407684 |
| Sum | 1.211430063 × 10^10 |
| Variance | 5910604069 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3038 | 28026 | 7.8% |
| 4751 | 19847 | 5.5% |
| 571 | 16854 | 4.7% |
| 2302 | 16713 | 4.6% |
| 4728 | 16156 | 4.5% |
| 1846 | 16091 | 4.5% |
| 7978 | 13592 | 3.8% |
| 282996 | 12265 | 3.4% |
| 161694 | 10295 | 2.9% |
| 64664 | 10295 | 2.9% |
| Other values (354) | 200903 |
| Value | Count | Frequency (%) |
| 123 | 26 | < 0.1% |
| 132 | 17 | < 0.1% |
| 180 | 880 | |
| 232 | 20 | < 0.1% |
| 249 | 663 | |
| 259 | 422 | |
| 262 | 472 | |
| 268 | 234 | 0.1% |
| 276 | 135 | < 0.1% |
| 311 | 939 |
| Value | Count | Frequency (%) |
| 283074 | 1197 | 0.3% |
| 282996 | 12265 | |
| 269833 | 10295 | |
| 210737 | 339 | 0.1% |
| 190882 | 10295 | |
| 161694 | 10295 | |
| 64664 | 10295 | |
| 27856 | 6895 | |
| 10428 | 10295 | |
| 8071 | 218 | 0.1% |
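The largest values in this table line up with the ten distinct values of the preceding 4096-multiple variable once they are rounded up to the next multiple of 4096, and the row counts match as well (for example 12265 + 1197 = 13462). The sketch below checks that relationship. The 4096-byte alignment and the pairing of values are assumptions inferred from the summary tables, not from row-level data, and `align_up` is a hypothetical helper.

```python
def align_up(size: int, alignment: int = 4096) -> int:
    """Round size up to the next multiple of alignment (ceiling division)."""
    return -(-size // alignment) * alignment


# Largest values from this table, paired by matching counts with the
# 4096-multiple values of the preceding variable (an inferred pairing).
observed = {
    283074: 286720,
    282996: 286720,
    269833: 270336,
    210737: 212992,
    190882: 192512,
    161694: 163840,
    64664: 65536,
    27856: 28672,
    10428: 12288,
}

for unaligned, aligned in observed.items():
    print(unaligned, "->", align_up(unaligned), "expected", aligned)
```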
virtual_address
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 212 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 893713.8976 |
| Minimum | 4096 |
|---|---|
| Maximum | 2068480 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.8 MiB |
Quantile statistics
| Minimum | 4096 |
|---|---|
| 5-th percentile | 167936 |
| Q1 | 712704 |
| median | 1060864 |
| Q3 | 1122304 |
| 95-th percentile | 1200128 |
| Maximum | 2068480 |
| Range | 2064384 |
| Interquartile range (IQR) | 409600 |
Descriptive statistics
| Standard deviation | 362645.281 |
|---|---|
| Coefficient of variation (CV) | 0.4057733487 |
| Kurtosis | 0.1088205522 |
| Mean | 893713.8976 |
| Median Absolute Deviation (MAD) | 73728 |
| Skewness | -1.234143403 |
| Sum | 3.226637844 × 10¹¹ |
| Variance | 1.315115998 × 10¹¹ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4096 | 10295 | 2.9% |
| 167936 | 10295 | 2.9% |
| 172032 | 10295 | 2.9% |
| 176128 | 10295 | 2.9% |
| 180224 | 10295 | 2.9% |
| 450560 | 10295 | 2.9% |
| 454656 | 10295 | 2.9% |
| 647168 | 10295 | 2.9% |
| 712704 | 10295 | 2.9% |
| 724992 | 10295 | 2.9% |
| Other values (202) | 258087 | 71.5% |
| Value | Count | Frequency (%) |
| 4096 | 10295 | 2.9% |
| 167936 | 10295 | 2.9% |
| 172032 | 10295 | 2.9% |
| 176128 | 10295 | 2.9% |
| 180224 | 10295 | 2.9% |
| 450560 | 10295 | 2.9% |
| 454656 | 10295 | 2.9% |
| 647168 | 10295 | 2.9% |
| 712704 | 10295 | 2.9% |
| 724992 | 10295 | 2.9% |
| Value | Count | Frequency (%) |
| 2068480 | 2 | < 0.1% |
| 2052096 | 2 | < 0.1% |
| 2007040 | 2 | < 0.1% |
| 1953792 | 2 | < 0.1% |
| 1826816 | 2 | < 0.1% |
| 1822720 | 2 | < 0.1% |
| 1814528 | 2 | < 0.1% |
| 1798144 | 2 | < 0.1% |
| 1794048 | 6 | < 0.1% |
| 1789952 | 2 | < 0.1% |
| Distinct | 15918 |
|---|---|
| Distinct (%) | 4.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| .rdata | 20590 |
|---|---|
| .text | 10295 |
| .crt1 | 10295 |
| .data | 10295 |
| .pdata | 10295 |
| Other values (15913) | 299267 |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 5.444513997 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1965671 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
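As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count the general category of each character in a few section names taken from the sample rows. It is not the code the report itself runs, and the `labels` mapping is a hand-written lookup covering only the four category codes that occur here.

```python
import unicodedata
from collections import Counter

# Section names taken from the sample rows of this report
section_names = [".text", ".rdata", ".crt1", "qwTG"]

# General category code for every character, e.g. "Ll" or "Po"
category_counts = Counter(
    unicodedata.category(ch) for name in section_names for ch in name
)

# Readable labels for the four codes that occur in these names
labels = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
}

for code, count in category_counts.most_common():
    print(f"{labels.get(code, code):<18} {count}")
```

The leading dot of a conventional section name falls into Other Punctuation, which is why that category tracks the number of dotted names so closely in the tables that follow.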
Unique
| Unique | 6369 |
|---|---|
| Unique (%) | 1.8% |
Sample
| 1st row | .text |
|---|---|
| 2nd row | .rdata |
| 3rd row | .crt1 |
| 4th row | .rdata |
| 5th row | .data |
Common Values
| Value | Count | Frequency (%) |
| .rdata | 20590 | 5.7% |
| .text | 10295 | 2.9% |
| .crt1 | 10295 | 2.9% |
| .data | 10295 | 2.9% |
| .pdata | 10295 | 2.9% |
| qwTG | 10295 | 2.9% |
| .rsrc | 10295 | 2.9% |
| .reloc | 10295 | 2.9% |
| .lqen | 9890 | 2.7% |
| .vqb | 9890 | 2.7% |
| Other values (15908) | 248602 | 68.9% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| rdata | 20590 | 5.7% |
| qwtg | 10295 | 2.9% |
| reloc | 10295 | 2.9% |
| rsrc | 10295 | 2.9% |
| text | 10295 | 2.9% |
| pdata | 10295 | 2.9% |
| crt1 | 10295 | 2.9% |
| data | 10295 | 2.9% |
| lqen | 9890 | 2.7% |
| gjd | 9890 | 2.7% |
| Other values (15908) | 248602 | 68.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 350742 | 17.8% |
| a | 120041 | 6.1% |
| r | 112921 | 5.7% |
| t | 110130 | 5.6% |
| q | 98491 | 5.0% |
| d | 82776 | 4.2% |
| c | 72871 | 3.7% |
| l | 71620 | 3.6% |
| e | 68716 | 3.5% |
| w | 59080 | 3.0% |
| Other values (20) | 818283 | 41.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1584044 | 80.6% |
| Other Punctuation | 350742 | 17.8% |
| Uppercase Letter | 20590 | 1.0% |
| Decimal Number | 10295 | 0.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 120041 | 7.6% |
| r | 112921 | 7.1% |
| t | 110130 | 7.0% |
| q | 98491 | 6.2% |
| d | 82776 | 5.2% |
| c | 72871 | 4.6% |
| l | 71620 | 4.5% |
| e | 68716 | 4.3% |
| w | 59080 | 3.7% |
| g | 55936 | 3.5% |
| Other values (16) | 731462 | 46.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 10295 | 50.0% |
| T | 10295 | 50.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 350742 | 100.0% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 10295 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1604634 | 81.6% |
| Common | 361037 | 18.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 120041 | 7.5% |
| r | 112921 | 7.0% |
| t | 110130 | 6.9% |
| q | 98491 | 6.1% |
| d | 82776 | 5.2% |
| c | 72871 | 4.5% |
| l | 71620 | 4.5% |
| e | 68716 | 4.3% |
| w | 59080 | 3.7% |
| g | 55936 | 3.5% |
| Other values (18) | 752052 | 46.9% |
Common
| Value | Count | Frequency (%) |
| . | 350742 | 97.1% |
| 1 | 10295 | 2.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1965671 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 350742 | 17.8% |
| a | 120041 | 6.1% |
| r | 112921 | 5.7% |
| t | 110130 | 5.6% |
| q | 98491 | 5.0% |
| d | 82776 | 4.2% |
| c | 72871 | 3.7% |
| l | 71620 | 3.6% |
| e | 68716 | 3.5% |
| w | 59080 | 3.0% |
| Other values (20) | 818283 | 41.6% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
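The definition above can be sketched in plain Python: rank both variables (tied values share their average rank), then divide the covariance of the ranks by the product of their standard deviations. The helper names `average_ranks`, `pearson` and `spearman` are illustrative and not the report's internals. The sample data are the `raw_size` and `virtual_size` values from the ten first rows shown in this report.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# raw_size and virtual_size from the ten first rows of this report
raw_size = [163840, 4096, 4096, 4096, 270336, 4096, 192512, 65536, 12288, 286720]
virtual_size = [161694, 3913, 1787, 3264, 269833, 2886, 190882, 64664, 10428, 282996]

rho = spearman(raw_size, virtual_size)
print(f"rho = {rho:.3f}")
```

On these ten rows ρ comes out at about 0.97; the four sections that share a `raw_size` of 4096 are tied in one variable but not in the other, which keeps it below 1. This is a ten-row illustration only and not the coefficient the report computes on the full dataset.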
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
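A minimal sketch of this calculation, together with a check of the invariance property, is shown below. The function name `pearson_r` and the sample values are made up for illustration; the scale factors used are positive, since a negative factor flips the sign of r.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Hypothetical, nearly linear sample data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)

# Shifting and (positively) rescaling each variable separately leaves r unchanged
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A steep and a shallow exact line both give r = 1: the slope does not matter
r_steep = pearson_r(x, [100 * a for a in x])
r_shallow = pearson_r(x, [0.01 * a for a in x])

print(r, r_transformed, r_steep, r_shallow)
```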
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
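The pair-counting definition above corresponds to the τ-a variant and can be sketched directly. The function name `kendall_tau_a` and the sample values are illustrative; note that when the data contain ties, implementations commonly apply a tie-corrected variant (τ-b) instead, so results may differ from this simple form.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie; it counts for neither side
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical sample without ties: 7 concordant and 3 discordant pairs out of 10
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]

tau = kendall_tau_a(x, y)
print(tau)
```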
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
First rows
| | df_index | Unnamed: 0 | filename | win_count | sha256 | imp_hash | sec_chi2 | sec_entropy | sec_md5 | raw_size | virtual_size | virtual_address | sec_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 890523 | 890523 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 65485.40 | 7.82 | 072b707ef80f7c15338cf1fd7c7212aa | 163840 | 161694 | 4096 | .text |
| 1 | 890524 | 890524 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 957362.25 | 0.48 | 9f46fc5a3fcb244f69598b159edfecd8 | 4096 | 3913 | 167936 | .rdata |
| 2 | 890525 | 890525 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 498667.88 | 2.78 | a24d116ab001c6148d20035da2529014 | 4096 | 1787 | 172032 | .crt1 |
| 3 | 890526 | 890526 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 273516.75 | 4.04 | ca2ef02cb2d2c48858ceb3137e44019d | 4096 | 3264 | 176128 | .rdata |
| 4 | 890527 | 890527 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 4015190.25 | 6.67 | 218e10d0dff11714c4062e870cc733ae | 270336 | 269833 | 180224 | .data |
| 5 | 890528 | 890528 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 940514.50 | 0.54 | bac11b3e0f44cd6f7fdbe64b8961f1d4 | 4096 | 2886 | 450560 | .pdata |
| 6 | 890529 | 890529 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 37923.03 | 7.87 | f096c568fbd7d92957f53fd1a92776c1 | 192512 | 190882 | 454656 | qwTG |
| 7 | 890530 | 890530 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 2810188.00 | 4.09 | 0e0a578022c8fef3fb155e5713e8195c | 65536 | 64664 | 647168 | .rsrc |
| 8 | 890531 | 890531 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 134540.95 | 5.14 | b60eaab7c709450be3cba1c56615936c | 12288 | 10428 | 712704 | .reloc |
| 9 | 890532 | 890532 | 2022041900/2022041900_10 | 169033 | 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4 | 25c7ac00c91884fd2923a489ae9dfbca | 73113600.00 | 0.00 | 4579108cda3cebc6432027a86e7b7a9b | 286720 | 282996 | 724992 | .lqen |
Last rows
| | df_index | Unnamed: 0 | filename | win_count | sha256 | imp_hash | sec_chi2 | sec_entropy | sec_md5 | raw_size | virtual_size | virtual_address | sec_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 361027 | 5674743 | 5674743 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 318 | 1163264 | .ures |
| 361028 | 5674744 | 5674744 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 3038 | 1167360 | .ycwyx |
| 361029 | 5674745 | 5674745 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 3038 | 1171456 | .klsrvp |
| 361030 | 5674746 | 5674746 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 2088960.00 | 0.00 | 0829f71740aab1ab98b33eae21dee122 | 8192 | 4388 | 1175552 | .bjcea |
| 361031 | 5674747 | 5674747 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 571 | 1183744 | .omsph |
| 361032 | 5674748 | 5674748 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 73113600.00 | 0.00 | 4579108cda3cebc6432027a86e7b7a9b | 286720 | 283074 | 1187840 | .lts |
| 361033 | 5674749 | 5674749 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 730 | 1474560 | .ceki |
| 361034 | 5674750 | 5674750 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 73113600.00 | 0.00 | 4579108cda3cebc6432027a86e7b7a9b | 286720 | 283074 | 1478656 | .mub |
| 361035 | 5674751 | 5674751 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 431 | 1765376 | .xer |
| 361036 | 5674752 | 5674752 | 2022042101/2022042101_45 | 615253 | 12f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c0639 | 25c7ac00c91884fd2923a489ae9dfbca | 534986.25 | 2.47 | 3bdc7903f23377c192129423e06221b9 | 4096 | 1395 | 1769472 | .lgp |